
    Visual Causal Feature Learning

    We provide a rigorous definition of the visual cause of a behavior that is broadly applicable to visually driven behavior in humans, animals, neurons, robots and other perceiving systems. Our framework generalizes standard accounts of causal learning to settings in which the causal variables need to be constructed from micro-variables. We prove the Causal Coarsening Theorem, which allows us to gain causal knowledge from observational data with minimal experimental effort. The theorem provides a connection to standard inference techniques in machine learning that identify features of an image that correlate with, but may not cause, the target behavior. Finally, we propose an active learning scheme to learn a manipulator function that performs optimal manipulations on the image to automatically identify the visual cause of a target behavior. We illustrate our inference and learning algorithms in experiments based on both synthetic and real data. Comment: Accepted at UAI 201
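    The key object behind the Causal Coarsening Theorem is the observational partition: micro-states (images) grouped by their conditional probability of the target behavior, with the causal partition guaranteed (almost surely) to be a coarsening of it. A minimal sketch of building that partition from a known conditional table — all names and probabilities here are hypothetical, not from the paper:

```python
from collections import defaultdict

def observational_partition(p_t_given_x):
    """Group micro-states x into cells with equal P(T | x).
    The Causal Coarsening Theorem implies the causal partition is (a.s.) a
    coarsening of these cells, so experiments only need to distinguish cells,
    not individual images. Real-valued estimates of P(T | x) would need a
    tolerance; exact equality suffices for this toy table."""
    cells = defaultdict(list)
    for x, p in p_t_given_x.items():
        cells[p].append(x)
    return list(cells.values())

# Hypothetical micro-states (images) with their observed P(target behavior | x).
p = {"img_a": 0.2, "img_b": 0.7, "img_c": 0.2, "img_d": 0.7}
print(observational_partition(p))  # [['img_a', 'img_c'], ['img_b', 'img_d']]
```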

    Automated Macro-scale Causal Hypothesis Formation Based on Micro-scale Observation

    This book introduces new concepts at the intersection of machine learning, causal inference and philosophy of science: the macrovariable cause and effect. Methods for learning such variables from microvariable data are introduced. The learning process proposes a minimal number of guided experiments that recover the macrovariable cause from observational data. Mathematical definitions of a micro- and macro-scale manipulation, an observational and causal partition, and a subsidiary variable are given. These concepts provide a link to previous work in causal inference and machine learning. The main theoretical result is the Causal Coarsening Theorem, a new insight into the measure-theoretic structure of probability spaces and structural equation models. The theorem provides grounds for automatic causal hypothesis formation from data. Other results concern the minimality and sufficiency of representations created in accordance with the theorem. Finally, this book proposes the first algorithms for supervised and unsupervised causal macrovariable discovery. These algorithms bridge large-scale, multidimensional machine learning and causal inference. In an application to climate science, the algorithms re-discover a known causal mechanism as a viable causal hypothesis. In a psychophysical experiment, the algorithms learn to minimally change visual stimuli to achieve a desired effect on human perception.

    Generalized Regressive Motion: a Visual Cue to Collision

    Brains and sensory systems evolved to guide motion. Central to this task is controlling the approach to stationary obstacles and detecting moving organisms. Looming has been proposed as the main monocular visual cue for detecting the approach of other animals and avoiding collisions with stationary obstacles. Elegant neural mechanisms for looming detection have been found in the brains of insects and vertebrates. However, looming has not been analyzed in the context of collisions between two moving animals. We propose an alternative strategy, Generalized Regressive Motion (GRM), which is consistent with recently observed behavior in fruit flies. Geometric analysis proves that GRM is a reliable cue to collision among conspecifics, while agent-based modeling suggests that GRM is a better cue than looming as a means to detect approach, prevent collisions and maintain mobility.

    Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

    Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state-of-the-art by a wide margin. Our code, evaluation procedure and model weights are available at this http URL.
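    The end-to-end training itself cannot be reproduced in a few lines, but the inference step common to ZSL classifiers — embedding videos and class names in a shared semantic space and assigning each video to the nearest class — can be sketched. Everything below (dimensions, embeddings, values) is hypothetical and illustrative only:

```python
import numpy as np

def zero_shot_classify(video_features, class_embeddings):
    """Assign each video to the class with the highest cosine similarity.
    Test classes need not appear in training: only their semantic
    embeddings (e.g., from class-name word vectors) are required."""
    v = video_features / np.linalg.norm(video_features, axis=1, keepdims=True)
    c = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    return np.argmax(v @ c.T, axis=1)

# Hypothetical: 2 unseen classes with 4-d semantic embeddings, 3 test videos.
classes = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0]])
videos = np.array([[0.9, 0.1, 0.0, 0.0],
                   [0.1, 0.8, 0.1, 0.0],
                   [0.7, 0.2, 0.1, 0.0]])
print(zero_shot_classify(videos, classes))  # [0 1 0]
```

    In the end-to-end setting described above, the video features would come from the trainable 3D CNN rather than a frozen, pretrained extractor.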

    Estimating Causal Direction and Confounding of Two Discrete Variables

    We propose a method to classify the causal relationship between two discrete variables given only the joint distribution of the variables, acknowledging that the method is subject to an inherent baseline error. We assume that the causal system is acyclic, but we do allow for hidden common causes. Our algorithm presupposes that the probability distribution P(C) of a cause C is independent of the probability distribution P(E∣C) of the cause-effect mechanism. While our classifier is trained with a Bayesian assumption of flat hyperpriors, we do not make this assumption about our test data. This work connects to recent developments on the identifiability of causal models over continuous variables under the assumption of "independent mechanisms". Carefully-commented Python notebooks that reproduce all our experiments are available online at http://vision.caltech.edu/~kchalupk/code.html
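    The generative assumption — P(C) drawn independently of the mechanism P(E∣C), each from a flat (Dirichlet) hyperprior — can be sketched directly. This is a toy illustration of the training-data model, not the paper's classifier:

```python
import numpy as np

def sample_causal_joint(n_c=3, n_e=3, rng=None):
    """Sample a joint P(C, E) under the independence-of-mechanisms assumption:
    the marginal P(C) and each mechanism row P(E | C=c) are drawn from
    independent flat Dirichlet distributions."""
    rng = rng or np.random.default_rng()
    p_c = rng.dirichlet(np.ones(n_c))                    # marginal of the cause
    p_e_given_c = rng.dirichlet(np.ones(n_e), size=n_c)  # one mechanism row per c
    return p_c[:, None] * p_e_given_c                    # joint P(C=c, E=e)

joint = sample_causal_joint(rng=np.random.default_rng(0))
print(np.isclose(joint.sum(), 1.0))  # True
```

    A classifier in this spirit would be trained on many such joints (labeled by whether C causes E, E causes C, or a hidden variable causes both) and then applied to an observed joint distribution.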

    Multi-Level Cause-Effect Systems

    We present a domain-general account of causation that applies to settings in which macro-level causal relations between two systems are of interest, but the relevant causal features are poorly understood and have to be aggregated from vast arrays of micro-measurements. Our approach generalizes that of Chalupka et al. (2015) to the setting in which the macro-level effect is not specified. We formalize the connection between micro- and macro-variables in such situations and provide a coherent framework describing causal relations at multiple levels of analysis. We present an algorithm that discovers macro-variable causes and effects from micro-level measurements obtained from an experiment. We further show how to design experiments to discover macro-variables from observational micro-variable data. Finally, we show that under specific conditions, one can identify multiple levels of causal structure. Throughout the article, we use a simulated neuroscience multi-unit recording experiment to illustrate the ideas and the algorithms.

    Fast Conditional Independence Test for Vector Variables with Large Sample Sizes

    We present and evaluate the Fast (conditional) Independence Test (FIT) -- a nonparametric conditional independence test. The test is based on the idea that when P(X∣Y,Z) = P(X∣Y), Z is not useful as a feature to predict X, as long as Y is also a regressor. On the contrary, if P(X∣Y,Z) ≠ P(X∣Y), Z might improve prediction results. FIT applies to thousand-dimensional random variables with a hundred thousand samples in a fraction of the time required by alternative methods. We provide an extensive evaluation that compares FIT to six extant nonparametric independence tests. The evaluation shows that FIT has a low probability of making both Type I and Type II errors compared to other tests, especially as the number of available samples grows. Our implementation of FIT is publicly available.
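    The core idea — testing whether adding Z as a regressor improves prediction of X beyond Y alone — can be sketched with plain least squares. This is only an illustration of the principle; the actual FIT uses different regressors and a proper statistical decision rule, and all names below are hypothetical:

```python
import numpy as np

def regression_mse(features, target):
    """Mean squared training error of least-squares regression of target on features."""
    X = np.column_stack([features, np.ones(len(target))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.mean((target - X @ coef) ** 2)

def fit_style_score(x, y, z):
    """How much Z reduces the error of predicting X once Y is already a regressor.
    A clearly positive score suggests conditional dependence of X on Z given Y."""
    return regression_mse(y, x) - regression_mse(np.column_stack([y, z]), x)

rng = np.random.default_rng(0)
y = rng.normal(size=(2000, 1))
z_indep = rng.normal(size=(2000, 1))                   # Z independent of X given Y
x = y[:, 0] + 0.1 * rng.normal(size=2000)
z_dep = x[:, None] + 0.1 * rng.normal(size=(2000, 1))  # Z carries extra info about X

print(fit_style_score(x, y, z_indep) < fit_style_score(x, y, z_dep))  # True
```

    A real test would compare held-out (not training) errors and calibrate the score against a null distribution before declaring dependence.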

    A Framework for Evaluating Approximation Methods for Gaussian Process Regression

    Gaussian process (GP) predictors are an important component of many Bayesian approaches to machine learning. However, even a straightforward implementation of Gaussian process regression (GPR) requires O(n^2) space and O(n^3) time for a data set of n examples. Several approximation methods have been proposed, but there is a lack of understanding of the relative merits of the different approximations and the situations in which they are most useful. We recommend assessing the quality of the predictions obtained as a function of the compute time taken, and comparing to standard baselines (e.g., Subset of Data and FITC). We empirically investigate four different approximation algorithms on four different prediction problems, and make our code available to encourage future comparisons.
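    A minimal numpy sketch of where the O(n^2)/O(n^3) costs come from, and of the Subset of Data baseline (run the exact predictor on m ≪ n points, cutting the cost to O(m^3)). The kernel, data, and sizes are illustrative assumptions; FITC and the timing-based evaluation protocol are not shown:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between 1-d point sets a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def gpr_predict(x_train, y_train, x_test, noise=1e-2):
    """Exact GPR predictive mean: O(n^2) space for K, O(n^3) time for the solve."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)                 # the O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return rbf_kernel(x_test, x_train) @ alpha

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3, 3, 200))
y = np.sin(x) + 0.1 * rng.normal(size=200)

full = gpr_predict(x, y, x)                   # exact GPR on all n = 200 points

# Subset of Data: the same exact predictor on a random m = 50 point subset.
subset = rng.choice(200, size=50, replace=False)
sod = gpr_predict(x[subset], y[subset], x)
```

    Comparing predictors like `full` and `sod` by accuracy achieved per unit of compute time is the evaluation the abstract recommends.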